Health Policy in the United States

library(data.table)       #Load all required packages
library(rvest)
library(lubridate)
library(dbplyr)
library(DataComputing)
library(magrittr)
library(Hmisc)
library(party)
library(rpart)
rm(list = ls())            #Clean up environment

Background Information

This dataset contains policy data for 50 US states and DC from 2001 to 2017. Data include information related to state legislation and regulations on nutrition, physical activity, and obesity in settings such as early care and education centers, restaurants, schools, work places, and others. To identify individual bills, use the identifier ProvisionID. A bill or citation may appear more than once because it could apply to multiple health or policy topics, settings, or states.

getwd()                          #Get working directory
[1] "/Users/fun.k/Final-Project"
setwd("~/Downloads")             #Set working directory 
The working directory was changed to /Users/fun.k/Downloads inside a notebook chunk. The working directory will be reset when the chunk is finished running. Use the knitr root.dir option in the setup chunk to change the working directory for notebook chunks.
untidy <-read.csv("CDC_Nutrition__Physical_Activity__and_Obesity_-_Legislation.csv")    #Read file in table format and create a data frame from it
gc(reset=TRUE)                 #Change from data frame to data table
           used  (Mb) gc trigger   (Mb) limit (Mb) max used  (Mb)
Ncells  3858814 206.1   11679122  623.8         NA  3858814 206.1
Vcells 17992663 137.3  723504616 5520.0      16384 17992663 137.3
tracemem(untidy)
[1] "<0x10ba1dc70>"
untidy <- as.data.table(untidy)
tracemem[0x10ba1dc70 -> 0x10b8ec830]: copy as.data.table.data.frame as.data.table 
tracemem[0x10ba1dc70 -> 0x1b33eabc0]: as.list.data.frame as.list vapply copy as.data.table.data.frame as.data.table 
gc()
           used  (Mb) gc trigger   (Mb) limit (Mb) max used  (Mb)
Ncells  3858796 206.1   11679122  623.8         NA  3863980 206.4
Vcells 17994682 137.3  578803692 4416.0      16384 18463617 140.9
untidy           #Take a look
#Clean Data 
#Change variables to start with lowercase/ make easier to use
Legislation <-
  untidy %>%
  rename(year = Year, quarter = Quarter, state = LocationAbbr, healthTopic = HealthTopic, policyTopic = PolicyTopic, dataSource = DataSource, setting = Setting, title = Title, status = Status, citation = Citation, statusAltValue = StatusAltValue, dataType = DataType, comments = Comments, enDate = EnactedDate, effDate = EffectiveDate, coordinates = GeoLocation, display = DisplayOrder)
Legislation

What type of information can we find with this data?

We can see how much legislation has been passed in certain years, which years have been the most progressive, which states have more legislation than others, which topic has been most popular, and more. This information can give government officials the ability to see where certain states need to improve, which topics need to be emphasized, and more.

Which year passed the most legislation?

yearlyData <-
Legislation %>%
  group_by(year) %>%
  summarise(total = sum(year)) %>%
  mutate(total = total/1000)
yearlyData %>%
  ggplot(aes(x=year,y=total ))+
  geom_bar(stat='identity',position='stack', width=.9, fill = "red") +
  ylab("Total (1000s)") +
  xlab("Year") 

Each bar in the chart represents one year. One can see that in 2011, the most legislation/regulation has been passed between 2001 and 2017.

Legislation vs. Regulation by Topic

Legislation %>%
  group_by(year, healthTopic, PolicyTypeID) %>%
  summarise(total = sum(year)) %>%
  mutate(total = total/1000) %>%
  ggplot(aes(x= year, y = total)) +
  geom_point(color = "red") +
  facet_grid(PolicyTypeID ~ healthTopic) +
  ylab("total (1000s)") +
  xlab("Year") 

As we can see, there has been a lot less regulation than legislation. Nutrititional legislation appears to be the most popular as represented by the many glyphs on the graph. Regulation is less popular and shows little to no type of trend in the graph. In each of the graphs, around 2011 there is peak which corresponds with our finding earlier about 2011 being the most popular year.

Legislation on a map

stateData <-
Legislation %>%
  group_by(state, year) %>%           #sort cases by state and year
  mutate(total = sum(year))        #create new variable total
stateData    #look at data
  
USMap(data = stateData, key = "state", fill = "total")   #plot on a USMap
Mapping API still under development and may change in future releases.

From this graph, we can see that states like New York, California, Washington, and Illnois, and Hawaii are weaker than most when it comes to passing legislation on nutrition, obesity, and physical activity.

Timeline of Legislation By Setting

#sort cases by year, setting, and policy type ID
Legislation %>%
  group_by(year, setting, PolicyTypeID) %>%
  summarise(total = sum(year)) %>%
  mutate(total = total/1000) %>%
  ggplot(aes(x = year, y = total, color = setting)) +
  geom_line() +
  ylab("total (1000s)") +
  facet_wrap(~PolicyTypeID)

In this graph we still see a peak in 2011, but we can also now see what the bills were passed for. Community takes the lead in both regulation and legislation.

Predict Year from Status and State

#subset legislation's data to optimize r running time
smaller <-
Legislation %>%
  filter(healthTopic == "Nutrition") %>% 
  filter( status != "Introduced") %>%
  filter( status != "Vetoed") %>%
  filter( state %in% c("PA", "CA"))
#create tree
statusPrediction <- 
  party::ctree(
    year ~ state + status,
    data = smaller)
#plot tree
plot(statusPrediction, type = "simple")

From this graph we conclude that a dead bill from California is likely to be from 2010 while a dead bill from Pennsylvania is likely to be from 2009.

LS0tCnRpdGxlOiAiUiBOb3RlYm9vayIKb3V0cHV0OiBodG1sX25vdGVib29rCi0tLQoKIyMjIEhlYWx0aCBQb2xpY3kgaW4gdGhlIFVuaXRlZCBTdGF0ZXMKCmBgYHtyfQpsaWJyYXJ5KGRhdGEudGFibGUpICAgICAgICNMb2FkIGFsbCByZXF1aXJlZCBwYWNrYWdlcwpsaWJyYXJ5KHJ2ZXN0KQpsaWJyYXJ5KGx1YnJpZGF0ZSkKbGlicmFyeShkYnBseXIpCmxpYnJhcnkoRGF0YUNvbXB1dGluZykKbGlicmFyeShtYWdyaXR0cikKbGlicmFyeShIbWlzYykKbGlicmFyeShwYXJ0eSkKbGlicmFyeShycGFydCkKbGlicmFyeShycGFydC5wbG90KQoKcm0obGlzdCA9IGxzKCkpICAgICAgICAgICAgI0NsZWFuIHVwIGVudmlyb25tZW50CgpgYGAKCgojIyNCYWNrZ3JvdW5kIEluZm9ybWF0aW9uCgpUaGlzIGRhdGFzZXQgY29udGFpbnMgcG9saWN5IGRhdGEgZm9yIDUwIFVTIHN0YXRlcyBhbmQgREMgZnJvbSAyMDAxIHRvIDIwMTcuIERhdGEgaW5jbHVkZSBpbmZvcm1hdGlvbiByZWxhdGVkIHRvIHN0YXRlIGxlZ2lzbGF0aW9uIGFuZCByZWd1bGF0aW9ucyBvbiBudXRyaXRpb24sIHBoeXNpY2FsIGFjdGl2aXR5LCBhbmQgb2Jlc2l0eSBpbiBzZXR0aW5ncyBzdWNoIGFzIGVhcmx5IGNhcmUgYW5kIGVkdWNhdGlvbiBjZW50ZXJzLCByZXN0YXVyYW50cywgc2Nob29scywgd29yayBwbGFjZXMsIGFuZCBvdGhlcnMuIFRvIGlkZW50aWZ5IGluZGl2aWR1YWwgYmlsbHMsIHVzZSB0aGUgaWRlbnRpZmllciBQcm92aXNpb25JRC4gQSBiaWxsIG9yIGNpdGF0aW9uIG1heSBhcHBlYXIgbW9yZSB0aGFuIG9uY2UgYmVjYXVzZSBpdCBjb3VsZCBhcHBseSB0byBtdWx0aXBsZSBoZWFsdGggb3IgcG9saWN5IHRvcGljcywgc2V0dGluZ3MsIG9yIHN0YXRlcy4gCgoKYGBge3J9CgoKZ2V0d2QoKSAgICAgICAgICAgICAgICAgICAgICAgICAgI0dldCB3b3JraW5nIGRpcmVjdG9yeQpzZXR3ZCgifi9Eb3dubG9hZHMiKSAgICAgICAgICAgICAjU2V0IHdvcmtpbmcgZGlyZWN0b3J5IAoKdW50aWR5IDwtcmVhZC5jc3YoIkNEQ19OdXRyaXRpb25fX1BoeXNpY2FsX0FjdGl2aXR5X19hbmRfT2Jlc2l0eV8tX0xlZ2lzbGF0aW9uLmNzdiIpICAgICNSZWFkIGZpbGUgaW4gdGFibGUgZm9ybWF0IGFuZCBjcmVhdGUgYSBkYXRhIGZyYW1lIGZyb20gaXQKCmdjKHJlc2V0PVRSVUUpICAgICAgICAgICAgICAgICAjQ2hhbmdlIGZyb20gZGF0YSBmcmFtZSB0byBkYXRhIHRhYmxlCnRyYWNlbWVtKHVudGlkeSkKdW50aWR5IDwtIGFzLmRhdGEudGFibGUodW50aWR5KQpnYygpCgpgYGAKCgoKYGBge3J9Cgp1bnRpZHkgICAgICAgICAgICNUYWtlIGEgbG9vawoKCmBgYAoKCgoKYGBge3J9CgojQ2xlYW4gRGF0YSAKI0NoYW5nZSB2YXJpYWJsZXMgdG8gc3RhcnQgd2l0aCBsb3dlcmNhc2UvIG1ha2UgZWFzaWVyIHRvIHVzZQoKTGVnaXNsYXRpb24gPC0KICB1bnRpZHkgJT4lCiAgcmVuYW1lKHllYXIgPSBZZWFyLCBxdWFydGVyID0gUXVhcnRlciwgc3RhdGUgPSBMb2NhdGlvbkFiYnIsIGhlYWx0aFRvcGljID0gSGVhbHRoVG9waWMsIHBvbGljeVRvcGljID0gUG9saWN5VG9waWMsIGRhdGFTb3VyY2UgPSBEYXRhU291cmNlLCBzZXR0aW5nID0gU2V0dGluZywgdGl0bGUgPSBUaXRsZSwgc3RhdHVzID0gU3RhdHVzLCBjaXRhdGlvbiA9IENpdGF0aW9uLCBzdGF0dXNBbHRWYWx1ZSA9IFN0YXR1c0FsdFZhbHVlLCBkYXRhVHlwZSA9IERhdGFUeXBlLCBjb21tZW50cyA9IENvbW1lbnRzLCBlbkRhdGUgPSBFbmFjdGVkRGF0ZSwgZWZmRGF0ZSA9IEVmZmVjdGl2ZURhdGUsIGNvb3JkaW5hdGVzID0gR2VvTG9jYXRpb24sIGRpc3BsYXkgPSBEaXNwbGF5T3JkZXIpCgpMZWdpc2xhdGlvbgoKCmBgYAoKCgoKIyMgV2hhdCB0eXBlIG9mIGluZm9ybWF0aW9uIGNhbiB3ZSBmaW5kIHdpdGggdGhpcyBkYXRhPwoKV2UgY2FuIHNlZSBob3cgbXVjaCBsZWdpc2xhdGlvbiBoYXMgYmVlbiBwYXNzZWQgaW4gY2VydGFpbiB5ZWFycywgd2hpY2ggeWVhcnMgaGF2ZSBiZWVuIHRoZSBtb3N0IHByb2dyZXNzaXZlLCB3aGljaCBzdGF0ZXMgaGF2ZSBtb3JlIGxlZ2lzbGF0aW9uIHRoYW4gb3RoZXJzLCB3aGljaCB0b3BpYyBoYXMgYmVlbiBtb3N0IHBvcHVsYXIsIGFuZCBtb3JlLiBUaGlzIGluZm9ybWF0aW9uIGNhbiBnaXZlIGdvdmVybm1lbnQgb2ZmaWNpYWxzIHRoZSBhYmlsaXR5IHRvIHNlZSB3aGVyZSBjZXJ0YWluIHN0YXRlcyBuZWVkIHRvIGltcHJvdmUsIHdoaWNoIHRvcGljcyBuZWVkIHRvIGJlIGVtcGhhc2l6ZWQsIGFuZCBtb3JlLgoKCgoKCgojIyBXaGljaCB5ZWFyIHBhc3NlZCB0aGUgbW9zdCBsZWdpc2xhdGlvbj8KCgpgYGB7cn0KCnllYXJseURhdGEgPC0KTGVnaXNsYXRpb24gJT4lCiAgZ3JvdXBfYnkoeWVhcikgJT4lCiAgc3VtbWFyaXNlKHRvdGFsID0gc3VtKHllYXIpKSAlPiUKICBtdXRhdGUodG90YWwgPSB0b3RhbC8xMDAwKQoKCnllYXJseURhdGEgJT4lCiAgZ2dwbG90KGFlcyh4PXllYXIseT10b3RhbCApKSsKICBnZW9tX2JhcihzdGF0PSdpZGVudGl0eScscG9zaXRpb249J3N0YWNrJywgd2lkdGg9LjksIGZpbGwgPSAicmVkIikgKwogIHlsYWIoIlRvdGFsICgxMDAwcykiKSArCiAgeGxhYigiWWVhciIpIAoKYGBgCgpFYWNoIGJhciBpbiB0aGUgY2hhcnQgcmVwcmVzZW50cyBvbmUgeWVhci4gT25lIGNhbiBzZWUgdGhhdCBpbiAyMDExLCB0aGUgbW9zdCBsZWdpc2xhdGlvbi9yZWd1bGF0aW9uIGhhcyBiZWVuIHBhc3NlZCBiZXR3ZWVuIDIwMDEgYW5kIDIwMTcuCgoKCiMjIExlZ2lzbGF0aW9uIHZzLiBSZWd1bGF0aW9uIGJ5IFRvcGljCgpgYGB7cn0KCkxlZ2lzbGF0aW9uICU+JQogIGdyb3VwX2J5KHllYXIsIGhlYWx0aFRvcGljLCBQb2xpY3lUeXBlSUQpICU+JQogIHN1bW1hcmlzZSh0b3RhbCA9IHN1bSh5ZWFyKSkgJT4lCiAgbXV0YXRlKHRvdGFsID0gdG90YWwvMTAwMCkgJT4lCiAgZ2dwbG90KGFlcyh4PSB5ZWFyLCB5ID0gdG90YWwpKSArCiAgZ2VvbV9wb2ludChjb2xvciA9ICJyZWQiKSArCiAgZmFjZXRfZ3JpZChQb2xpY3lUeXBlSUQgfiBoZWFsdGhUb3BpYykgKwogIHlsYWIoInRvdGFsICgxMDAwcykiKSArCiAgeGxhYigiWWVhciIpIAoKCgpgYGAKCgpBcyB3ZSBjYW4gc2VlLCB0aGVyZSBoYXMgYmVlbiBhIGxvdCBsZXNzIHJlZ3VsYXRpb24gdGhhbiBsZWdpc2xhdGlvbi4gTnV0cml0aXRpb25hbCBsZWdpc2xhdGlvbiBhcHBlYXJzIHRvIGJlIHRoZSBtb3N0IHBvcHVsYXIgYXMgcmVwcmVzZW50ZWQgYnkgdGhlIG1hbnkgZ2x5cGhzIG9uIHRoZSBncmFwaC4gUmVndWxhdGlvbiBpcyBsZXNzIHBvcHVsYXIgYW5kIHNob3dzIGxpdHRsZSB0byBubyB0eXBlIG9mIHRyZW5kIGluIHRoZSBncmFwaC4gSW4gZWFjaCBvZiB0aGUgZ3JhcGhzLCBhcm91bmQgMjAxMSB0aGVyZSBpcyBwZWFrIHdoaWNoIGNvcnJlc3BvbmRzIHdpdGggb3VyIGZpbmRpbmcgZWFybGllciBhYm91dCAyMDExIGJlaW5nIHRoZSBtb3N0IHBvcHVsYXIgeWVhci4KCgoKCgoKCiMjIExlZ2lzbGF0aW9uIG9uIGEgbWFwCgpgYGB7cn0Kc3RhdGVEYXRhIDwtCkxlZ2lzbGF0aW9uICU+JQogIGdyb3VwX2J5KHN0YXRlLCB5ZWFyKSAlPiUgICAgICAgICAgICNzb3J0IGNhc2VzIGJ5IHN0YXRlIGFuZCB5ZWFyCiAgbXV0YXRlKHRvdGFsID0gc3VtKHllYXIpKSAgICAgICAgI2NyZWF0ZSBuZXcgdmFyaWFibGUgdG90YWwKCnN0YXRlRGF0YSAgICAjbG9vayBhdCBkYXRhCiAgClVTTWFwKGRhdGEgPSBzdGF0ZURhdGEsIGtleSA9ICJzdGF0ZSIsIGZpbGwgPSAidG90YWwiKSAgICNwbG90IG9uIGEgVVNNYXAKCgpgYGAKCkZyb20gdGhpcyBncmFwaCwgd2UgY2FuIHNlZSB0aGF0IHN0YXRlcyBsaWtlIE5ldyBZb3JrLCBDYWxpZm9ybmlhLCBXYXNoaW5ndG9uLCBhbmQgSWxsbm9pcywgYW5kIEhhd2FpaSBhcmUgd2Vha2VyIHRoYW4gbW9zdCB3aGVuIGl0IGNvbWVzIHRvIHBhc3NpbmcgbGVnaXNsYXRpb24gb24gbnV0cml0aW9uLCBvYmVzaXR5LCBhbmQgcGh5c2ljYWwgYWN0aXZpdHkuCgoKCgoKCiMjIFRpbWVsaW5lIG9mIExlZ2lzbGF0aW9uIEJ5IFNldHRpbmcKCmBgYHtyfQoKI3NvcnQgY2FzZXMgYnkgeWVhciwgc2V0dGluZywgYW5kIHBvbGljeSB0eXBlIElECgpMZWdpc2xhdGlvbiAlPiUKICBncm91cF9ieSh5ZWFyLCBzZXR0aW5nLCBQb2xpY3lUeXBlSUQpICU+JQogIHN1bW1hcmlzZSh0b3RhbCA9IHN1bSh5ZWFyKSkgJT4lCiAgbXV0YXRlKHRvdGFsID0gdG90YWwvMTAwMCkgJT4lCiAgZ2dwbG90KGFlcyh4ID0geWVhciwgeSA9IHRvdGFsLCBjb2xvciA9IHNldHRpbmcpKSArCiAgZ2VvbV9saW5lKCkgKwogIHlsYWIoInRvdGFsICgxMDAwcykiKSArCiAgZmFjZXRfd3JhcCh+UG9saWN5VHlwZUlEKQoKCmBgYAoKSW4gdGhpcyBncmFwaCB3ZSBzdGlsbCBzZWUgYSBwZWFrIGluIDIwMTEsIGJ1dCB3ZSBjYW4gYWxzbyBub3cgc2VlIHdoYXQgdGhlIGJpbGxzIHdlcmUgcGFzc2VkIGZvci4gQ29tbXVuaXR5IHRha2VzIHRoZSBsZWFkIGluIGJvdGggcmVndWxhdGlvbiBhbmQgbGVnaXNsYXRpb24uIAoKCgoKCgoKIyNQcmVkaWN0IFllYXIgZnJvbSBTdGF0dXMgYW5kIFN0YXRlCgpgYGB7cn0KCiNzdWJzZXQgbGVnaXNsYXRpb24ncyBkYXRhIHRvIG9wdGltaXplIHIgcnVubmluZyB0aW1lCnNtYWxsZXIgPC0KTGVnaXNsYXRpb24gJT4lCiAgZmlsdGVyKGhlYWx0aFRvcGljID09ICJOdXRyaXRpb24iKSAlPiUgCiAgZmlsdGVyKCBzdGF0dXMgIT0gIkludHJvZHVjZWQiKSAlPiUKICBmaWx0ZXIoIHN0YXR1cyAhPSAiVmV0b2VkIikgJT4lCiAgZmlsdGVyKCBzdGF0ZSAlaW4lIGMoIlBBIiwgIkNBIikpCgojY3JlYXRlIHRyZWUKc3RhdHVzUHJlZGljdGlvbiA8LSAKICBwYXJ0eTo6Y3RyZWUoCiAgICB5ZWFyIH4gc3RhdGUgKyBzdGF0dXMsCiAgICBkYXRhID0gc21hbGxlcikKCiNwbG90IHRyZWUKcGxvdChzdGF0dXNQcmVkaWN0aW9uLCB0eXBlID0gInNpbXBsZSIpCgoKYGBgCgpGcm9tIHRoaXMgZ3JhcGggd2UgY29uY2x1ZGUgdGhhdCBhIGRlYWQgYmlsbCBmcm9tIENhbGlmb3JuaWEgaXMgbGlrZWx5IHRvIGJlIGZyb20gMjAxMCB3aGlsZSBhIGRlYWQgYmlsbCBmcm9tIFBlbm5zeWx2YW5pYSBpcyBsaWtlbHkgdG8gYmUgZnJvbSAyMDA5LgoKCgoKCgoKCgoKCgoKCgoKCg==